If you don’t already have these packages installed, run this code:
install.packages("plotly")
install.packages("ggplot2")
install.packages("dplyr")
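If some of these packages may already be installed, a small check avoids reinstalling them. The following is a minimal sketch, not part of the original activity; `missing_packages` is a hypothetical helper name, and the `interactive()` guard is an assumption so that nothing is installed when the file is knit or run non-interactively.

```r
# Return the subset of `pkgs` that is not installed yet.
# `missing_packages` is a hypothetical helper used only in this sketch.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("plotly", "ggplot2", "dplyr"))

# Install only what is missing, and only in an interactive session
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install)
}
```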
Run this code to load the packages required for this activity:
library(plotly)
## Loading required package: ggplot2
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
library(ggplot2)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
Introduction
Data visualization is a very important tool in data science. It helps people share the findings of their research and analysis with audiences from many different backgrounds through effective visuals. At Macalester, the data visualization package we are most commonly taught in R is ggplot2. This package lets us take a data frame and turn it into a graphic, with the user specifying aesthetic parameters to create the desired visualization.
For this activity, we will build on our knowledge of data visualization by learning plotly, a package for “interactive web-based data visualization” that can be used in R and Python. This package lets us take in data and turn it into an interactive visualization that can enhance the message of the analysis. Through completing this activity and its reflection points, the hope is that you will learn a new skill to add to your bag of tricks.
Before beginning this activity, please look through this reference, which covers the basics and syntax of plotting with plotly: https://plotly-r.com/. Throughout this activity, if anything is unclear, please look back at it for code help.
Section 1: Basics
For this activity, we will be using the mtcars data set that is built into R.
glimpse(mtcars)
## Rows: 32
## Columns: 11
## $ mpg <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2, 17.8,…
## $ cyl <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4, 8,…
## $ disp <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 140.8, 16…
## $ hp <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 180, 180…
## $ drat <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92, 3.92,…
## $ wt <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.150, 3.…
## $ qsec <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22.90, 18…
## $ vs <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0,…
## $ am <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0,…
## $ gear <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 3, 3,…
## $ carb <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1, 1, 2,…
As mentioned above, we have commonly been taught to use ggplot to make visualizations. It is an effective package that gives users a vast number of options. Included below is a scatter plot using ggplot and the mtcars dataset.
ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
geom_point()+
labs(
x = "Weight (X thousand lbs)", y = "Miles per gallon",
title = "Fuel efficiency by weight", color = "Cylinders") +
theme_minimal()
Reflection
What is each part of the ggplot call doing to build the visual?
What could be improved in this graph to make it more informative?
Answer here:
Section 1, part b
Just like ggplot, Plotly allows us to create visualizations, but with slightly different formatting. Below are several examples of common graph types—this time using Plotly.
As you run each cell, interact with the plots:
Try:
Zooming in (double-click to zoom back out)
Hovering over points/bars
Thinking about what information becomes easier to see interactively
plot_ly(
data = mtcars, x = ~wt, y = ~mpg, type = "scatter", mode = "markers") %>%
layout(title = "Scatter Plot: Weight vs. MPG", xaxis = list(title = "Weight (thousand lbs)"), yaxis = list(title = "Miles per Gallon"))
plot_ly(
data = mtcars, x = ~factor(cyl), type = "histogram") %>%
layout( title = "Histogram of Cylinder Count", xaxis = list(title = "Number of Cylinders"), yaxis = list(title = "Count of Cars"))
mtcars_sorted <- mtcars %>% arrange(hp)
plot_ly(
data = mtcars_sorted, x = ~hp, y = ~mpg, type = "scatter", mode = "lines") %>%
layout(title = "Line Plot: MPG Across Increasing Horsepower", xaxis = list(title = "Horsepower"), yaxis = list(title = "Miles per Gallon"))
plot_ly(
data = mtcars, x = ~wt, y = ~mpg, z = ~hp, color = ~factor(cyl), type = "scatter3d", mode = "markers") %>%
layout(
title = "3D Scatter Plot: Weight, MPG, and Horsepower", scene = list( xaxis = list(title = "Weight (thousand lbs)"), yaxis = list(title = "Miles per Gallon"), zaxis = list(title = "Horsepower")),
legend = list(title = list(text = "Cylinders")))
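For readers coming from ggplot, the same kind of scatter plot can also be written in a layered style, where a trace is added to an empty plot much like a geom is added to a ggplot object. This is a minimal sketch using plotly's `add_markers()`; the object name `layered_scatter` is only illustrative.

```r
library(plotly)

# Start from the data and axis mappings, then add a marker layer
# (similar in spirit to adding geom_point() in ggplot)
layered_scatter <- plot_ly(data = mtcars, x = ~wt, y = ~mpg) %>%
  add_markers(color = ~factor(cyl)) %>%
  layout(
    title = "Scatter Plot: Weight vs. MPG",
    xaxis = list(title = "Weight (thousand lbs)"),
    yaxis = list(title = "Miles per Gallon")
  )

layered_scatter
```

`add_markers()` is shorthand for a trace with type = "scatter" and mode = "markers", so this should produce the same kind of plot as the first example above, with one trace per cylinder group.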
Reflection
How does interactivity (hover, zoom, filtering) change the way you understand the information in these graphs compared to static ggplot visuals?
In what types of data scenarios do you think Plotly would add meaningful value, and when might a static ggplot be more appropriate?
Answer here:
Section 2: Making the graph interactive
Returning to the original ggplot scatterplot, there are two ways to convert it into an interactive Plotly visualization.
Approach 1. Build the plot with ggplot as before and let plotly convert it. As you can see, this is the same ggplot code we used earlier; passing the saved plot to ggplotly() turns it into an interactive graphic. If you hover over a point, you can see its x and y values.
p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
geom_point() +
labs(
x = "Weight (X thousand lbs)",
y = "Miles per gallon",
title = "Fuel efficiency by weight",
color = "Cylinders"
) +
theme_minimal()
# Convert the saved ggplot object into an interactive plotly graphic
ggplotly(p)
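With Approach 1 you can still control what appears on hover. The sketch below assumes you want the car model in the tooltip: it copies the row names into a column (the names `cars_named` and `model` are made up for this example), maps that column to the `text` aesthetic, and asks ggplotly() to show only that aesthetic through its `tooltip` argument.

```r
library(ggplot2)
library(plotly)

# Keep the car names as a regular column so they can be mapped like any variable
cars_named <- mtcars
cars_named$model <- rownames(mtcars)

p_named <- ggplot(
  cars_named,
  aes(x = wt, y = mpg, color = factor(cyl), text = model)
) +
  geom_point() +
  labs(
    x = "Weight (X thousand lbs)",
    y = "Miles per gallon",
    title = "Fuel efficiency by weight",
    color = "Cylinders"
  ) +
  theme_minimal()

# tooltip = "text" limits the hover label to the `text` aesthetic
interactive_plot <- ggplotly(p_named, tooltip = "text")
interactive_plot
```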
Approach 2. This approach uses Plotly syntax from the start, giving more control over features like hover labels, legends, colors, and marker styling.
plot_ly(
data = mtcars,
x = ~wt,
y = ~mpg,
color = ~factor(cyl),
colors = "Set1",
type = "scatter",
mode = "markers",
marker = list(size = 10)
) %>%
layout(
title = "Fuel efficiency by weight",
xaxis = list(title = "Weight (X thousand lbs)"),
yaxis = list(title = "Miles per gallon"),
legend = list(title = list(text = "Cylinders"))
)
This activity only introduces the basics of Plotly, but the package offers many additional tools and visualization types. We encourage you to explore further and try out different interactive features and plot options.
Key Takeaways:
Plotly uses concepts similar to ggplot but with different syntax
Basic plots can be created with a few lines of code (it's pretty easy to make them interactive!)
Interactivity sets it apart from ggplot (hovering, zooming, and rotating help make patterns clearer)
Plotly is most useful when you want the user to explore and learn on their own a bit
Section 3: Adding Hover Labels, Color, and Customization
So far, we’ve used Plotly to make basic interactive plots where you can zoom and hover. But the default hover labels and colors aren’t always the most helpful.
Plotly lets you:
Customize hover labels: decide exactly which variables appear when you hover over a point, and how they’re formatted (e.g., “Weight: 2.5, MPG: 22”).
Control color: map color to a variable (like number of cylinders) and choose a color palette that makes groups easy to compare.
Style markers: change size, opacity, and sometimes symbol, which helps highlight important points or avoid overplotting.
This combination makes your graph feel more like a little “data app” — each hover tells a small story about that specific car.
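Here’s a customized scatterplot using mtcars. We’ll color the points by cyl (number of cylinders), make the markers larger and slightly transparent, and show each car's model, weight, MPG, horsepower, and gears in the hover label.
plot_ly(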
data = mtcars,
x = ~wt,
y = ~mpg,
type = "scatter",
mode = "markers",
color = ~factor(cyl),
colors = "Dark2",
marker = list(size = 12, opacity = 0.8),
text = ~paste(
"Model:", rownames(mtcars),
"<br>Weight:", wt,
"<br>MPG:", mpg,
"<br>Horsepower:", hp,
"<br>Gears:", gear
),
hoverinfo = "text"
) %>%
layout(
title = "Fuel Efficiency with Custom Hover Labels",
xaxis = list(title = "Weight (thousand lbs)"),
yaxis = list(title = "Miles per Gallon"),
legend = list(title = list(text = "Cylinders"))
)
When you hover over points now, you don’t just see numbers — you get a mini “profile” of each car.
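If you also want to control how the numbers are formatted in the hover label, plotly's `hovertemplate` attribute is one option. The following is a minimal sketch rather than the approach used in this activity: `%{x}`, `%{y}`, and `%{text}` refer to the values of each point, the part after the colon is a d3-style number format, and an empty `<extra></extra>` tag hides the secondary box that would otherwise show the trace name.

```r
library(plotly)

formatted_hover <- plot_ly(
  data = mtcars,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  text = ~rownames(mtcars),          # car model, made available as %{text}
  hovertemplate = paste0(
    "<b>%{text}</b><br>",
    "Weight: %{x:.2f} (thousand lbs)<br>",
    "MPG: %{y:.1f}",
    "<extra></extra>"                # hide the extra trace-name box
  )
) %>%
  layout(
    xaxis = list(title = "Weight (thousand lbs)"),
    yaxis = list(title = "Miles per Gallon")
  )

formatted_hover
```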
Exercise
Goal: Practice customizing hover labels and styling so the plot tells a clearer story.
Start from the example above (you can copy/paste it).
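Make the following three changes:
Add at least one more variable to the hover text (e.g., qsec or carb).
Change the variable or palette used for color.
Adjust the marker styling using size, opacity, or symbol.
Use this chunk as your starting point: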
# Exercise: Customize hover labels and styling
plot_ly(
data = mtcars,
x = ~wt,
y = ~mpg,
type = "scatter",
mode = "markers",
# TODO: choose your color mapping and palette
color = ~factor(cyl),
colors = "Set1",
marker = list(
size = 10, # you can change this
opacity = 0.9 # and this
),
text = ~paste(
# TODO: customize your hover text
"Model:", rownames(mtcars),
"<br>Weight:", wt,
"<br>MPG:", mpg
),
hoverinfo = "text"
) %>%
layout(
title = "Your Customized Interactive Plot",
xaxis = list(title = "Weight (thousand lbs)"),
yaxis = list(title = "Miles per Gallon")
)
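As a hint for the marker-styling step, here is a small sketch of mapping the marker symbol to a variable. It assumes you want to distinguish transmission type (am, where 0 is automatic and 1 is manual in mtcars); the labels and the two symbol names are choices made for this example.

```r
library(plotly)

symbol_plot <- plot_ly(
  data = mtcars,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  # One symbol per transmission type
  symbol = ~factor(am, labels = c("Automatic", "Manual")),
  symbols = c("circle", "x"),
  marker = list(size = 10, opacity = 0.8)
) %>%
  layout(
    xaxis = list(title = "Weight (thousand lbs)"),
    yaxis = list(title = "Miles per Gallon")
  )

symbol_plot
```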
Short reflection
Answer here:
Section 5: Filter Function & Zooming
Interactivity isn’t just about pretty hover labels — it’s also about controlling which data you see and how closely you look at it.
There are two main ideas here:
Filtering the data before plotting: using dplyr::filter(), you can focus on a subset of the data (e.g., only cars with high MPG, or only 4-cylinder cars). This makes your interactive plot more targeted.
Zooming and panning inside Plotly: once the plot is rendered, you can click and drag to zoom into a region, switch to the pan tool in the toolbar to move around, and double-click to reset the view.
Combining filtering + zooming lets you move between “big picture” and “close-up” views of your dataset.
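One thing to watch when filtering: hover text built from rownames(mtcars) no longer lines up once rows have been dropped. A simple workaround, sketched below, is to store the car names in a column before filtering and refer to that column in the plot. The names `cars_named`, `model`, and `light_efficient`, and the filter thresholds, are assumptions made for this example.

```r
library(dplyr)
library(plotly)

# Store the car names as a column so they survive filtering
cars_named <- mtcars
cars_named$model <- rownames(mtcars)

# filter() accepts several conditions; a row must satisfy all of them
light_efficient <- cars_named %>%
  filter(mpg > 25, wt < 2.5)

filtered_plot <- plot_ly(
  data = light_efficient,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  text = ~model,            # column from the filtered data, not rownames(mtcars)
  hoverinfo = "text",
  marker = list(size = 10)
)

filtered_plot
```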
Example A: Filter to only 4-cylinder cars
mtcars_4cyl <- mtcars %>%
filter(cyl == 4)
plot_ly(
data = mtcars_4cyl,
x = ~wt,
y = ~mpg,
type = "scatter",
mode = "markers",
marker = list(size = 10),
color = ~factor(gear)
) %>%
layout(
title = "Filtered Plot: Only 4-Cylinder Cars",
xaxis = list(title = "Weight"),
yaxis = list(title = "Miles per Gallon"),
legend = list(title = list(text = "Gears"))
)
Example B: Zooming on an unfiltered plot
plot_ly(
data = mtcars,
x = ~wt,
y = ~mpg,
type = "scatter",
mode = "markers",
color = ~factor(cyl)
) %>%
layout(
title = "Try Zooming and Panning (Drag to zoom, double-click to reset)",
xaxis = list(title = "Weight"),
yaxis = list(title = "Miles per Gallon")
)
Try: click-and-drag to zoom into a cluster of points, then double-click to reset.
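Zooming can also be set in code, so that the plot opens already focused on a region of interest. The sketch below assumes a starting window of 1.5 to 3.5 for weight and 15 to 35 for MPG (arbitrary values chosen for illustration) and makes panning the default drag action; double-clicking should still reset the view.

```r
library(plotly)

zoomed <- plot_ly(
  data = mtcars,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers",
  color = ~factor(cyl)
) %>%
  layout(
    xaxis = list(title = "Weight", range = c(1.5, 3.5)),          # initial x window
    yaxis = list(title = "Miles per Gallon", range = c(15, 35)),  # initial y window
    dragmode = "pan"                                              # drag moves the view instead of zooming
  )

zoomed
```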
Exercise
Goal: Use both filtering and zooming to explore a subset of cars more deeply.
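Use dplyr::filter() to pick one subset of interest, such as:
High-MPG cars (mpg > 25)
High-horsepower cars (hp > 150)
6-cylinder cars (cyl == 6)
Manual-transmission cars (am == 1)
Make an interactive scatterplot of wt vs mpg for that subset.
Interact with it: zoom into a cluster of points, hover over individual cars, and double-click to reset the view.
Write 1–2 sentences about something you saw that you might not have noticed in the full dataset.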
Scaffolded code:
# Exercise: Filter + zoom exploration
# 1. Filter the dataset (change this line to your own filter condition)
mtcars_subset <- mtcars %>%
filter(mpg > 25) # <-- edit this condition
# 2. Make an interactive scatterplot
plot_ly(
data = mtcars_subset,
x = ~wt,
y = ~mpg,
type = "scatter",
mode = "markers",
color = ~factor(cyl),
marker = list(size = 10)
) %>%
layout(
title = "Filtered & Zoomable Plot",
xaxis = list(title = "Weight"),
yaxis = list(title = "Miles per Gallon")
)
## Warning in RColorBrewer::brewer.pal(N, "Set2"): minimal value for n is 3, returning requested palette with 3 different levels
## Warning in RColorBrewer::brewer.pal(N, "Set2"): minimal value for n is 3, returning requested palette with 3 different levels
Reflection:
Answer here:
Section 6: Design Your Own Plotly Visualization
Goal: Now it’s your turn to design your own interactive visualization!
Task:
# Option A: keep using mtcars
your_data <- mtcars
# Option B: replace with your own dataset, e.g.:
# your_data <- readr::read_csv("my_data.csv")
# TODO: change these to variables that make sense for your question
x_var <- ~wt
y_var <- ~mpg
color_var <- ~factor(cyl)
plot_ly(
data = your_data,
x = x_var,
y = y_var,
type = "___",
mode = "___",
color = color_var,
marker = list()
) %>%
layout(
title = "_______",
xaxis = list(title = "______"),
yaxis = list(title = "______")
)
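For reference, here is one possible way the template could be filled in. It is only a sketch, assuming the built-in iris dataset and a question about how sepal length relates to petal length across species; your own variables, plot type, and labels will differ.

```r
library(plotly)

# Assumption for this example: use the built-in iris dataset
your_data <- iris

own_plot <- plot_ly(
  data = your_data,
  x = ~Sepal.Length,
  y = ~Petal.Length,
  type = "scatter",
  mode = "markers",
  color = ~Species,
  marker = list(size = 8, opacity = 0.8)
) %>%
  layout(
    title = "Petal Length by Sepal Length",
    xaxis = list(title = "Sepal length (cm)"),
    yaxis = list(title = "Petal length (cm)")
  )

own_plot
```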
Reflection:
Answer here:
Section 7: Putting It Together: A Mini Dashboard (Subplots)
Plotly also makes it very easy to combine multiple interactive charts into a single dashboard-style view. This is useful when you want to compare several patterns at once, show different summaries of the same dataset, or tell a bigger story with your data.
We’ll use the function subplot() to place several plot_ly() objects in one layout.
Let’s build three separate plots, then combine them.
# 1. Scatter: weight vs. MPG, colored by cylinders
p_scatter <- plot_ly(
data = mtcars,
x = ~wt,
y = ~mpg,
type = "scatter",
mode = "markers",
color = ~factor(cyl),
colors = "Set1",
marker = list(size = 10, opacity = 0.8),
text = ~paste(
"Model:", rownames(mtcars),
"<br>Weight:", wt,
"<br>MPG:", mpg,
"<br>Cylinders:", cyl
),
hoverinfo = "text"
) %>%
layout(
title = "Weight vs MPG",
xaxis = list(title = "Weight"),
yaxis = list(title = "Miles per Gallon")
)
# 2. Boxplot: MPG by cylinder count
p_box <- plot_ly(
data = mtcars,
x = ~factor(cyl),
y = ~mpg,
type = "box",
color = ~factor(cyl),
colors = "Set1"
) %>%
layout(
title = "MPG Distribution by Cylinders",
xaxis = list(title = "Cylinders"),
yaxis = list(title = "MPG")
)
# 3. Bar chart: count of cars by # of gears
gear_counts <- mtcars %>%
count(gear)
p_bar <- plot_ly(
data = gear_counts,
x = ~factor(gear),
y = ~n,
type = "bar"
) %>%
layout(
title = "Number of Cars by Gears",
xaxis = list(title = "Gears"),
yaxis = list(title = "Count of cars")
)
# Combine them into a 2x2 dashboard (one empty cell)
dashboard <- subplot(
p_scatter, p_box, # first row
p_bar, NULL, # second row (empty space on right)
nrows = 2, # Arrange plots in 2 rows
titleX = TRUE,
titleY = TRUE,
margin = 0.05 # space between plots
)
dashboard
Try: change the parameters and make it the way you like!
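One parameter worth experimenting with is `shareY` (or `shareX`), which makes the panels use a common axis so that zooming in one panel rescales the other. A minimal sketch, assuming two scatter plots that both have MPG on the y-axis; the object names are illustrative only.

```r
library(plotly)

p_wt <- plot_ly(
  data = mtcars, x = ~wt, y = ~mpg,
  type = "scatter", mode = "markers", name = "Weight"
)

p_hp <- plot_ly(
  data = mtcars, x = ~hp, y = ~mpg,
  type = "scatter", mode = "markers", name = "Horsepower"
)

# shareY = TRUE puts both panels on one common y-axis (MPG)
shared_axis <- subplot(p_wt, p_hp, nrows = 1, shareY = TRUE, titleX = TRUE)

shared_axis
```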
We can further improve readability by adjusting the overall layout
dashboard %>%
layout(
title = list(
text = "Dashboard: Exploring mtcars",
font = list(size = 22)
),
paper_bgcolor = "white", # background for whole dashboard
plot_bgcolor = "rgba(245,245,245,0.6)" # background of each subplot
)
Reflection
Answer here:
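Section 8: Linking Plots
Plotly also supports linked brushing, which means that selecting points in one plot highlights the corresponding points in the other plots. First, here are three plots placed side by side without any linking: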
# Scatter plot: relationship between weight and mpg
p1 <- plot_ly(
data = mtcars,
x = ~wt,
y = ~mpg,
type = "scatter",
mode = "markers",
color = ~factor(cyl),
text = ~paste("Model:", rownames(mtcars)),
hoverinfo = "text"
) %>%
layout(title = "Weight vs MPG (colored by cylinders)")
# Boxplot: MPG distribution by cylinders
p2 <- plot_ly(
data = mtcars,
x = ~factor(cyl),
y = ~mpg,
type = "box",
color = ~factor(cyl)
) %>%
layout(title = "MPG Distribution by Cylinders")
# Histogram: distribution of car weights
p3 <- plot_ly(
data = mtcars,
x = ~wt,
type = "histogram"
) %>%
layout(title = "Distribution of Car Weights")
subplot(p1, p2, p3, nrows = 1, margin = 0.05)
crosstalk is a package that allows different Plotly plots to communicate with each other. It enables:
linked brushing (select points in one plot → highlights in others)
shared filtering
library(crosstalk)
# Add a shared key to link the plots
shared <- SharedData$new(mtcars)
# Linked scatter plot
pA <- plot_ly(
shared,
x = ~wt,
y = ~mpg,
type = "scatter",
mode = "markers",
color = ~factor(cyl)
) %>%
layout(title = "Scatter: Weight vs MPG")
# Linked boxplot
pB <- plot_ly(
shared,
x = ~factor(cyl),
y = ~mpg,
type = "box",
color = ~factor(cyl)
) %>%
layout(title = "Boxplot: MPG by Cylinders")
# Linked histogram
pC <- plot_ly(
shared,
x = ~wt,
type = "histogram"
) %>%
layout(title = "Histogram: Weight Distribution")
subplot(pA, pB, pC, nrows = 1)
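To make the brushing behavior explicit, plotly's `highlight()` function lets you choose which interaction triggers the link. The sketch below is one way to set this up, not the only one: it uses `highlight_key()`, plotly's wrapper for creating shared data, sets box selection as the default drag mode, and links two scatter plots so that points selected in one are highlighted in the other. The object names are assumptions made for this example.

```r
library(plotly)

# highlight_key() wraps the data so that plots built from it can be linked
shared_cars <- highlight_key(mtcars)

scatter_wt <- plot_ly(
  shared_cars, x = ~wt, y = ~mpg,
  type = "scatter", mode = "markers"
)

scatter_hp <- plot_ly(
  shared_cars, x = ~hp, y = ~mpg,
  type = "scatter", mode = "markers"
)

linked <- subplot(scatter_wt, scatter_hp, nrows = 1, shareY = TRUE, titleX = TRUE) %>%
  layout(dragmode = "select", showlegend = FALSE) %>%     # drag draws a selection box
  highlight(on = "plotly_selected", off = "plotly_deselect")

linked
```

Try dragging a box around a few points in the left panel; the same cars should be emphasized in the right panel, and double-clicking should clear the selection.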
Reflection
Answer here:
Section 9: Optional Challenge
Use the new skills you learned from this lesson to build your own mini interactive dashboard using a dataset you think is interesting and answer a question that is meaningful.
AI usage
AI assistance was used to improve grammar, clarity, and spelling.
AI assistance was used for formatting.